经验和实验证据表明,人工智能算法学会收取超竞争价格。在本文中,我们开发了一种理论模型来通过自适应学习算法研究勾结。使用流体近似技术,我们表征了一般游戏的连续时间学习成果,并确定勾结的主要驱动力:协调偏见。在一个简单的主导策略游戏中,我们展示了算法估计之间的相关性如何导致持续的偏见,从长远来看持续犯罪行动。我们证明,使用反事实收益来告知其更新的算法避免了这种偏见并融合了主导策略。我们设计了一种带有反馈的机制:设计师揭示了事前信息以帮助反事实计算。我们表明,这种机制实现了社会最佳。最后,我们将我们的框架应用于文献中研究和拍卖的两个模拟,并分析结果合理化。
translated by 谷歌翻译
Structural Health Monitoring (SHM) describes a process for inferring quantifiable metrics of structural condition, which can serve as input to support decisions on the operation and maintenance of infrastructure assets. Given the long lifespan of critical structures, this problem can be cast as a sequential decision making problem over prescribed horizons. Partially Observable Markov Decision Processes (POMDPs) offer a formal framework to solve the underlying optimal planning task. However, two issues can undermine the POMDP solutions. Firstly, the need for a model that can adequately describe the evolution of the structural condition under deterioration or corrective actions and, secondly, the non-trivial task of recovery of the observation process parameters from available monitoring data. Despite these potential challenges, the adopted POMDP models do not typically account for uncertainty on model parameters, leading to solutions which can be unrealistically confident. In this work, we address both key issues. We present a framework to estimate POMDP transition and observation model parameters directly from available data, via Markov Chain Monte Carlo (MCMC) sampling of a Hidden Markov Model (HMM) conditioned on actions. The MCMC inference estimates distributions of the involved model parameters. We then form and solve the POMDP problem by exploiting the inferred distributions, to derive solutions that are robust to model uncertainty. We successfully apply our approach on maintenance planning for railway track assets on the basis of a "fractal value" indicator, which is computed from actual railway monitoring data.
translated by 谷歌翻译
The polynomial kernels are widely used in machine learning and they are one of the default choices to develop kernel-based classification and regression models. However, they are rarely used and considered in numerical analysis due to their lack of strict positive definiteness. In particular they do not enjoy the usual property of unisolvency for arbitrary point sets, which is one of the key properties used to build kernel-based interpolation methods. This paper is devoted to establish some initial results for the study of these kernels, and their related interpolation algorithms, in the context of approximation theory. We will first prove necessary and sufficient conditions on point sets which guarantee the existence and uniqueness of an interpolant. We will then study the Reproducing Kernel Hilbert Spaces (or native spaces) of these kernels and their norms, and provide inclusion relations between spaces corresponding to different kernel parameters. With these spaces at hand, it will be further possible to derive generic error estimates which apply to sufficiently smooth functions, thus escaping the native space. Finally, we will show how to employ an efficient stable algorithm to these kernels to obtain accurate interpolants, and we will test them in some numerical experiment. After this analysis several computational and theoretical aspects remain open, and we will outline possible further research directions in a concluding section. This work builds some bridges between kernel and polynomial interpolation, two topics to which the authors, to different extents, have been introduced under the supervision or through the work of Stefano De Marchi. For this reason, they wish to dedicate this work to him in the occasion of his 60th birthday.
translated by 谷歌翻译
The proliferation of deep learning techniques led to a wide range of advanced analytics applications in important business areas such as predictive maintenance or product recommendation. However, as the effectiveness of advanced analytics naturally depends on the availability of sufficient data, an organization's ability to exploit the benefits might be restricted by limited data or likewise data access. These challenges could force organizations to spend substantial amounts of money on data, accept constrained analytics capacities, or even turn into a showstopper for analytics projects. Against this backdrop, recent advances in deep learning to generate synthetic data may help to overcome these barriers. Despite its great potential, however, synthetic data are rarely employed. Therefore, we present a taxonomy highlighting the various facets of deploying synthetic data for advanced analytics systems. Furthermore, we identify typical application scenarios for synthetic data to assess the current state of adoption and thereby unveil missed opportunities to pave the way for further research.
translated by 谷歌翻译
To make machine learning (ML) sustainable and apt to run on the diverse devices where relevant data is, it is essential to compress ML models as needed, while still meeting the required learning quality and time performance. However, how much and when an ML model should be compressed, and {\em where} its training should be executed, are hard decisions to make, as they depend on the model itself, the resources of the available nodes, and the data such nodes own. Existing studies focus on each of those aspects individually, however, they do not account for how such decisions can be made jointly and adapted to one another. In this work, we model the network system focusing on the training of DNNs, formalize the above multi-dimensional problem, and, given its NP-hardness, formulate an approximate dynamic programming problem that we solve through the PACT algorithmic framework. Importantly, PACT leverages a time-expanded graph representing the learning process, and a data-driven and theoretical approach for the prediction of the loss evolution to be expected as a consequence of training decisions. We prove that PACT's solutions can get as close to the optimum as desired, at the cost of an increased time complexity, and that, in any case, such complexity is polynomial. Numerical results also show that, even under the most disadvantageous settings, PACT outperforms state-of-the-art alternatives and closely matches the optimal energy cost.
translated by 谷歌翻译
Learned Bloom Filters, i.e., models induced from data via machine learning techniques and solving the approximate set membership problem, have recently been introduced with the aim of enhancing the performance of standard Bloom Filters, with special focus on space occupancy. Unlike in the classical case, the "complexity" of the data used to build the filter might heavily impact on its performance. Therefore, here we propose the first in-depth analysis, to the best of our knowledge, for the performance assessment of a given Learned Bloom Filter, in conjunction with a given classifier, on a dataset of a given classification complexity. Indeed, we propose a novel methodology, supported by software, for designing, analyzing and implementing Learned Bloom Filters in function of specific constraints on their multi-criteria nature (that is, constraints involving space efficiency, false positive rate, and reject time). Our experiments show that the proposed methodology and the supporting software are valid and useful: we find out that only two classifiers have desirable properties in relation to problems with different data complexity, and, interestingly, none of them has been considered so far in the literature. We also experimentally show that the Sandwiched variant of Learned Bloom filters is the most robust to data complexity and classifier performance variability, as well as those usually having smaller reject times. The software can be readily used to test new Learned Bloom Filter proposals, which can be compared with the best ones identified here.
translated by 谷歌翻译
Every automaton can be decomposed into a cascade of basic automata. This is the Prime Decomposition Theorem by Krohn and Rhodes. We show that cascades allow for describing the sample complexity of automata in terms of their components. In particular, we show that the sample complexity is linear in the number of components and the maximum complexity of a single component, modulo logarithmic factors. This opens to the possibility of learning automata representing large dynamical systems consisting of many parts interacting with each other. It is in sharp contrast with the established understanding of the sample complexity of automata, described in terms of the overall number of states and input letters, which implies that it is only possible to learn automata where the number of states is linear in the amount of data available. Instead our results show that one can learn automata with a number of states that is exponential in the amount of data available.
translated by 谷歌翻译
Understanding how biological neural networks carry out learning using spike-based local plasticity mechanisms can lead to the development of powerful, energy-efficient, and adaptive neuromorphic processing systems. A large number of spike-based learning models have recently been proposed following different approaches. However, it is difficult to assess if and how they could be mapped onto neuromorphic hardware, and to compare their features and ease of implementation. To this end, in this survey, we provide a comprehensive overview of representative brain-inspired synaptic plasticity models and mixed-signal CMOS neuromorphic circuits within a unified framework. We review historical, bottom-up, and top-down approaches to modeling synaptic plasticity, and we identify computational primitives that can support low-latency and low-power hardware implementations of spike-based learning rules. We provide a common definition of a locality principle based on pre- and post-synaptic neuron information, which we propose as a fundamental requirement for physical implementations of synaptic plasticity. Based on this principle, we compare the properties of these models within the same framework, and describe the mixed-signal electronic circuits that implement their computing primitives, pointing out how these building blocks enable efficient on-chip and online learning in neuromorphic processing systems.
translated by 谷歌翻译
深度神经网络需要特定的层来处理点云,因为点的分散和不规则位置使我们无法使用卷积过滤器。在这里,我们介绍了复合层,该复合层是点云的新卷积操作员。我们的复合层的特殊性是,它在将点与其特征向量结合之前从点位置提取和压缩空间信息。与众所周知的点横向跨层相比,我们的复合层提供了额外的正则化,并确保了参数和参数数量方面的灵活性更大。为了展示设计灵活性,我们还定义了一个集合复合层,该复合层以非线性方式组合空间信息和特征,并且我们使用这些层来实现卷积和聚集的综合材料。我们训练我们的复合烯类进行分类,最引人注目的是无监督的异常检测。我们对合成和现实世界数据集的实验表明,在这两个任务中,我们的CompositeNets都优于表现要点,尽管具有更简单的体系结构,但取得了与KPCONV相似的结果。此外,我们的复合烯类基本上优于现有的解决方案,用于点云上的异常检测。
translated by 谷歌翻译
我们考虑根据视觉检测自动移动机器人异常的任务。我们对相关类型的视觉异常进行分类,并讨论如何通过无监督的深度学习方法检测到它们。我们提出了一个专门为此任务构建的新型数据集,并在该任务上测试了最先进的方法。我们终于在实际情况下讨论部署。
translated by 谷歌翻译